Data Summarization Using Maximal Marginal Relevance Method
Abstract
Searching a huge data collection for interesting information is a difficult task that frustrates those who seek it. Automatic text summarization has emerged to facilitate this search. It compresses an original document into an abridged version by extracting almost all of its essential concepts with text mining techniques. Selecting distinct ideas (“diversity”) from the original document can produce an appropriate summary, and combining multiple techniques can help to find that diversity in the text. This paper presents an approach to text summarization in which three sources of evidence (clustering, a binary tree, and a diversity-based method) are used to find the document's distinct ideas. The emphasis of the approach is on controlling redundancy in the summarized text. Because clustering plays a very important role, the K-means clustering algorithm is used. A Document-Sentence Tree is built, to which the MMI approach is applied to obtain a quality text summary.
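To make the redundancy-control idea concrete, the following is a minimal sketch of greedy sentence selection with the Maximal Marginal Relevance criterion named in the title. It is not the authors' implementation: the K-means clustering and the Document-Sentence Tree are left out, and the function names (`bag_of_words`, `cosine`, `mmr_summary`), the bag-of-words cosine similarity, the use of the whole document as a stand-in query, and the trade-off parameter `lam` are all assumptions made for illustration.

```python
import math
import re
from collections import Counter


def bag_of_words(text):
    """Lowercased term-frequency vector for a piece of text."""
    return Counter(re.findall(r"[a-z0-9]+", text.lower()))


def cosine(a, b):
    """Cosine similarity between two sparse term-frequency vectors."""
    dot = sum(a[t] * b[t] for t in a if t in b)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def mmr_summary(sentences, k=2, lam=0.7):
    """Greedily pick up to k sentences by Maximal Marginal Relevance.

    lam weights relevance against redundancy: lam=1.0 ignores redundancy,
    smaller values penalize similarity to already selected sentences.
    """
    vectors = [bag_of_words(s) for s in sentences]

    # Assumption: with no user query, the whole document stands in for it.
    query = Counter()
    for v in vectors:
        query.update(v)

    selected = []
    candidates = list(range(len(sentences)))
    while candidates and len(selected) < k:

        def score(i):
            relevance = cosine(vectors[i], query)
            # Highest similarity to any sentence already in the summary.
            redundancy = max(
                (cosine(vectors[i], vectors[j]) for j in selected),
                default=0.0,
            )
            return lam * relevance - (1 - lam) * redundancy

        best = max(candidates, key=score)
        selected.append(best)
        candidates.remove(best)

    # Present the chosen sentences in their original document order.
    return [sentences[i] for i in sorted(selected)]
```

With a small `lam` the redundancy penalty dominates, so a sentence that repeats an already selected one tends to be skipped in favor of a less relevant but novel sentence. With `lam=1.0` the selection reduces to plain relevance ranking.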
Similar resources
Summarization: (1) Using MMR for Diversity-Based Reranking and (2) Evaluating Summaries
This paper develops a method for combining query-relevance with information-novelty in the context of text retrieval and summarization. The Maximal Marginal Relevance (MMR) criterion strives to reduce redundancy while maintaining query relevance in reranking retrieved documents and in selecting appropriate passages for text summarization. Preliminary results indicate some benefits for MMR dive...
Query-Focused Multidocument Summarization Based on Hybrid Relevance Analysis and Surface Feature Salience
Query-focused multidocument summarization is to synthesize from a set of topic-related documents a brief, well-organized, fluent summary for the purpose of answering an information need that cannot be met by just stating a name, date, quantity, etc. In this paper, the task is essentially treated as a sentence retrieval task. We propose a hybrid relevance analysis to evaluate the relevance of a ...
Query Snowball: A Co-occurrence-based Approach to Multi-document Summarization for Question Answering
We propose a new method for query-oriented extractive multi-document summarization. To enrich the information need representation of a given query, we build a co-occurrence graph to obtain words that augment the original query terms. We then formulate the summarization problem as a Maximum Coverage Problem with Knapsack Constraints based on word pairs rather than single words. Our experiments w...
Extractive summarization of meeting recordings
Several approaches to automatic speech summarization are discussed below, using the ICSI Meetings corpus. We contrast feature-based approaches using prosodic and lexical features with maximal marginal relevance and latent semantic analysis approaches to summarization. While the latter two techniques are borrowed directly from the field of text summarization, feature-based approaches using proso...
Video Summarization Based on Balanced AV-MMR
Among the techniques of video processing, video summarization is a promising approach to process the multimedia content. In this paper we present a novel summarization algorithm, Balanced Audio Video Maximal Marginal Relevance (Balanced AV-MMR or BAV-MMR), for multi-video summarization based on both audio and visual information. Balanced AV-MMR exploits the balance between audio information and ...
Putting the User in the Loop: Interactive Maximal Marginal Relevance for Query-Focused Summarization
This work represents an initial attempt to move beyond “single-shot” summarization to interactive summarization. We present an extension to the classic Maximal Marginal Relevance (MMR) algorithm that places a user “in the loop” to assist in candidate selection. Experiments in the complex interactive Question Answering (ciQA) task at TREC 2007 show that interactively-constructed responses are si...